Robust Restless Bandits: Tackling Interval Uncertainty with Deep Reinforcement Learning
We introduce Robust Restless Bandits, a challenging generalization of
restless multi-arm bandits (RMAB). RMABs have been widely studied for
intervention planning with limited resources. However, most works make the
unrealistic assumption that the transition dynamics are known perfectly,
restricting the applicability of existing methods to real-world scenarios. To
make RMABs more useful in settings with uncertain dynamics: (i) We introduce
the Robust RMAB problem and develop solutions for a minimax regret objective
when transitions are given by interval uncertainties; (ii) We develop a double
oracle algorithm for solving Robust RMABs and demonstrate its effectiveness on
three experimental domains; (iii) To enable our double oracle approach, we
introduce RMABPPO, a novel deep reinforcement learning algorithm for solving
RMABs. RMABPPO hinges on learning an auxiliary "λ-network" that allows
each arm's learning to decouple, greatly reducing sample complexity required
for training; (iv) Under minimax regret, the adversary in the double oracle
approach is notoriously difficult to implement due to non-stationarity. To
address this, we formulate the adversary oracle as a multi-agent reinforcement
learning problem and solve it with a multi-agent extension of RMABPPO, which
may be of independent interest as the first known algorithm for this setting.
Code is available at https://github.com/killian-34/RobustRMAB.
Comment: 18 pages, 3 figures
Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data
Digital Adherence Technologies (DATs) are an increasingly popular method for
verifying patient adherence to many medications. We analyze data from one city
served by 99DOTS, a phone-call-based DAT deployed for Tuberculosis (TB)
treatment in India, where nearly 3 million people are afflicted with the disease
each year. The data contains nearly 17,000 patients and 2.1M dose records. We
lay the groundwork for learning from this real-world data, including a method
for avoiding the effects of unobserved interventions in training data used for
machine learning. We then construct a deep learning model, demonstrate its
interpretability, and show how it can be adapted and trained in different
clinical scenarios to better target and improve patient care. In the real-time
risk prediction setting our model could be used to proactively intervene with
21% more patients and before 76% more missed doses than current heuristic
baselines. For outcome prediction, our model performs 40% better than baseline
methods, allowing cities to target more resources to clinics with a heavier
burden of patients at risk of failure. Finally, we present a case study
demonstrating how our model can be trained in an end-to-end decision-focused
learning setting to achieve 15% better solution quality in an example decision
problem faced by health workers.
Comment: 10 pages, 6 figures
Roflumilast in moderate-to-severe chronic obstructive pulmonary disease treated with longacting bronchodilators: two randomised clinical trials
Background: Patients with chronic obstructive pulmonary disease (COPD) have few options for treatment. The efficacy and safety of the phosphodiesterase-4 inhibitor roflumilast have been investigated in studies of patients with moderate-to-severe COPD, but not in those concomitantly treated with longacting inhaled bronchodilators. The effect of roflumilast on lung function was investigated in patients with moderate-to-severe COPD who are already being treated with salmeterol or tiotropium.
Methods: In two double-blind, multicentre studies done in an outpatient setting, after a 4-week run-in, patients older than 40 years with moderate-to-severe COPD were randomly assigned to oral roflumilast 500 μg or placebo once a day for 24 weeks, in addition to salmeterol (M2-127 study) or tiotropium (M2-128 study). The primary endpoint was change in prebronchodilator forced expiratory volume in 1 s (FEV1). Analysis was by intention to treat. The studies are registered with ClinicalTrials.gov, number NCT00313209 for M2-127, and NCT00424268 for M2-128.
Findings: In the salmeterol plus roflumilast trial, 466 patients were assigned to and treated with roflumilast and 467 with placebo; in the tiotropium plus roflumilast trial, 371 patients were assigned to and treated with roflumilast and 372 with placebo. Compared with placebo, roflumilast consistently improved mean prebronchodilator FEV1 by 49 mL (p<0.0001) in patients treated with salmeterol, and 80 mL (p<0.0001) in those treated with tiotropium. Similar improvement in postbronchodilator FEV1 was noted in both groups. Furthermore, roflumilast had beneficial effects on other lung function measurements and on selected patient-reported outcomes in both groups. Nausea, diarrhoea, weight loss, and, to a lesser extent, headache were more frequent in patients in the roflumilast groups. These adverse events were associated with increased patient withdrawal.
Interpretation: Roflumilast improves lung function in patients with COPD treated with salmeterol or tiotropium, and could become an important treatment for these patients.
The genome of the sea urchin Strongylocentrotus purpuratus
We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus
purpuratus, a model for developmental and systems biology. The sequencing strategy combined
whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones,
aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome.
The genome encodes about 23,300 genes, including many previously thought to be vertebrate
innovations or known only outside the deuterostomes. This echinoderm genome provides an
evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Genomic investigations of unexplained acute hepatitis in children
Since its first identification in Scotland, over 1,000 cases of unexplained hepatitis in children have been reported worldwide, including 278 cases in the UK. Here we report an investigation of 38 cases, 66 age-matched immunocompetent controls and 21 immunocompromised comparator participants, using a combination of genomic, transcriptomic, proteomic and immunohistochemical methods. We detected high levels of adeno-associated virus 2 (AAV2) DNA in the liver, blood, plasma or stool from 27 of 28 cases. We found low levels of adenovirus (HAdV) and human herpesvirus 6B (HHV-6B) in 23 of 31 and 16 of 23, respectively, of the cases tested. By contrast, AAV2 was infrequently detected and at low titre in the blood or the liver from control children with HAdV, even when profoundly immunosuppressed. AAV2, HAdV and HHV-6 phylogeny excluded the emergence of novel strains in cases. Histological analyses of explanted livers showed enrichment for T cells and B lineage cells. Proteomic comparison of liver tissue from cases and healthy controls identified increased expression of HLA class 2, immunoglobulin variable regions and complement proteins. HAdV and AAV2 proteins were not detected in the livers. Instead, we identified AAV2 DNA complexes reflecting both HAdV-mediated and HHV-6B-mediated replication. We hypothesize that high levels of abnormal AAV2 replication products aided by HAdV and, in severe cases, HHV-6B may have triggered immune-mediated hepatic disease in genetically and immunologically predisposed children.
Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation
Restless multi-armed bandits (RMABs) are an important model for optimizing the allocation of limited resources in sequential decision-making settings. Typical RMABs assume the budget (the number of arms pulled) to be fixed for each step in the planning horizon. However, in real-world planning, resources are not necessarily limited at each planning step; we may be able to distribute surplus resources in one round to an earlier or later round. In such settings, this flexibility in budget is often constrained to within a subset of consecutive planning steps, e.g., weekly planning of a monthly budget. In this paper we define a general class of RMABs with flexible budget, which we term F-RMABs, and provide an algorithm to solve them optimally. We derive a min-max formulation to find optimal policies for F-RMABs and leverage gradient primal-dual algorithms to solve for reward-maximizing policies with flexible budgets. We introduce a scheme to sample expected gradients to apply primal-dual algorithms to the F-RMAB setting and make an otherwise computationally expensive approach tractable. Additionally, we provide heuristics that trade off solution quality for efficiency and present experimental comparisons of different F-RMAB solution approaches.